Recap: Descriptive Statistics in Charlotte's Rental Market¶
Before mapping Charlotte’s rental landscape, we examined descriptive statistics to understand how pricing behaves across unit types, neighborhoods, and complexes. These insights provide a quantitative foundation for interpreting spatial trends.
Key Findings:
Neighborhood-Level Distributions revealed geographic stratification in rent. Uptown and West Charlotte showed elevated medians and wider interquartile ranges, indicating premium pricing and volatility. SouthPark and NoDa displayed lower medians and tighter spreads, reflecting affordability and pricing stability.
Price Per Square Foot (PPSF) reinforced unit-level segmentation. Studios and one-bedrooms commanded higher PPSF values, especially in central locations, suggesting a premium on compact urban living. Larger units offered more space but showed greater variance, highlighting trade-offs between square footage and affordability.
Complex-Level Variability was evident in rent range and IQR comparisons. Properties with narrow distributions reflected standardized layouts and targeted pricing strategies. Complexes with broader spreads suggested diverse unit mixes and flexible positioning, tailored to multiple renter personas.
Unit Size Per Bedroom revealed design intent and livability trade-offs. One-bedroom units consistently offered the most square footage per bedroom, while three-bedrooms offered the least. This inverse relationship reflected cost-efficiency strategies and segmentation by tenant type.
Outlier Detection using IQR and z-scores added analytical precision. IQR flagged broader deviations across bedroom types, while z-scores isolated extreme cases that distorted mean-based metrics. These methods confirmed the presence of atypical inventory and supported segmentation and data cleaning.
Correlation Analysis quantified structural relationships between key variables. Rent correlated moderately with square footage and bedroom count, though the strength varied by unit type. Amenities showed weak to moderate correlations, especially in larger units, suggesting lifestyle features influence pricing but are not primary drivers.
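As a brief illustration of the outlier-detection finding above, the following sketch contrasts the two rules on a small list of rents. It is a minimal example under stated assumptions: the rent values are hypothetical, the helper names `iqr_outliers` and `zscore_outliers` are not from the original analysis, and the cutoffs (1.5 × IQR and |z| > 3) are common defaults rather than confirmed settings.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

def zscore_outliers(values, threshold=3.0):
    """Return values whose absolute z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []
    return [v for v in values if abs((v - mean) / stdev) > threshold]

# Hypothetical one-bedroom rents (illustrative only, not taken from the dataset)
rents = [1450, 1500, 1520, 1550, 1580, 1600, 1620, 1650, 1700, 2400, 5200]

print("IQR rule:    ", iqr_outliers(rents))
print("z-score rule:", zscore_outliers(rents))
```

On this sample the IQR rule flags both unusually high rents, while the z-score rule flags only the most extreme one, because that value inflates the mean and standard deviation used to judge the rest.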
Charlotte’s rental market is both diverse and strategically layered. Descriptive statistics reveal a landscape shaped by affordability, luxury, and design variation. These insights support more accurate forecasting, targeted development, and context-aware pricing strategies.
Next, we turn to geospatial analysis to explore how location influences rent behavior, amenity access, and market segmentation across the city.
Introducing the Geospatial Layer¶
The next phase of our analysis incorporates geospatial techniques to explore how rental pricing behaves across Charlotte’s urban landscape. By layering price per square foot (PPSF) data onto geographic coordinates, we can visualize spatial segmentation, identify pricing clusters, and assess how value and cost align with neighborhood boundaries.
Heatmaps reveal density patterns of high and low PPSF listings, helping us distinguish premium zones from budget-friendly areas.
MarkerClusters provide listing-level granularity, allowing us to examine how unit size, amenities, and location contribute to pricing variation.
Neighborhood Polygons anchor pricing logic in geographic context, showing whether pricing behavior aligns with established submarkets or spills across transitional zones.
PPSF Threshold Segmentation isolates listings above $\$2.00$ and below $\$1.50$, creating a clear contrast between luxury inventory and affordability corridors.
By combining these spatial tools, we move beyond raw metrics to uncover how geography shapes pricing strategy, renter targeting, and market identity. This geospatial layer transforms tabular data into actionable insight, supporting developers, investors, and planners in making location-informed decisions.
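The threshold segmentation described above can be sketched as a simple banding function. This is a minimal illustration, not the notebook's implementation: the band labels and the names `PPSF_BREAKS`, `PPSF_BANDS`, and `ppsf_band` are assumptions, and a value exactly at a breakpoint is placed in the higher band, which is one reasonable convention.

```python
from bisect import bisect_right

PPSF_BREAKS = [1.50, 2.00]                          # dollars per square foot
PPSF_BANDS = ["affordable", "midrange", "premium"]  # hypothetical labels

def ppsf_band(ppsf):
    """Return the band for a PPSF value; a value on a breakpoint goes to the higher band."""
    return PPSF_BANDS[bisect_right(PPSF_BREAKS, ppsf)]

for value in (1.32, 1.50, 1.91, 2.00, 2.29):
    print(f"{value:.2f} -> {ppsf_band(value)}")
```

Keeping the breakpoints in one list makes it easy to reuse the same bands for marker colors, legends, and heatmap filters.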
Data Loading and Cleaning¶
########>> INITIALIZE <<########
#basic operation libraries
import os
import sys
import ast
import datetime
import re
import time
#data analysis libraries
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import matplotlib.colors as mcolors
from scipy.stats import zscore
#machine learning libraries
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler
#geospatial libraries
import folium
from folium.plugins import HeatMap, MarkerCluster
from geopy.geocoders import Nominatim
from geopy.exc import GeocoderTimedOut
from geopy.distance import geodesic
from shapely.geometry import MultiPoint, Polygon
#map styling & HTML Injection
from branca.element import Element
from folium import Html
#web scraping
from bs4 import BeautifulSoup
#display settings for Jupyter display, feel free to edit
from IPython.display import display, HTML
#display settings for Pandas, feel free to edit
pd.set_option('display.html.table_schema', True)
pd.set_option('expand_frame_repr', True)
pd.set_option('display.max_colwidth', 200)
pd.options.display.html.use_mathjax = False
#manage warnings
import warnings
warnings.filterwarnings('ignore')
# using the following script allows me to 'know' when my block of code has completed
print ("\n" + '{:<5} : {:2}'.format("Finished", str(datetime.datetime.now())))
Finished : 2025-10-22 20:50:59.316360
apt = pd.read_csv("C:\\Users\\alexp\\Charlotte_Apartments.csv")
apt = apt.dropna(how='all').reset_index(drop=True)
# Now calculate missing Rent %
missing_rent_pct = apt['Rent'].isna().mean() * 100
print(f"Percentage of missing Rent values: {missing_rent_pct:.2f}%")
Percentage of missing Rent values: 20.00%
apt['Rent'] = apt.groupby(['Complex', 'Bedrooms'])['Rent'].transform(lambda x: x.fillna(x.mean()))
apt['Rent'].isna().mean() * 100
np.float64(0.0)
apt['price_per_sqft'] = apt['Rent'] / apt['Sqft']
apt.head()
| | Complex | Address | Unit_Variant | Bedrooms | Bathrooms | Rent | Sqft | Amenities | Website | Neighborhood | price_per_sqft |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Moderna Liberty Row | 7740 Liberty Row Dr, Charlotte, NC 28210 | S01 | 0.0 | 1.0 | 1469.0 | 651.0 | In-unit washer/dryer; High-speed internet in common areas; Controlled access bicycle storage; Additional storage available; Resort-style pool; 24-hour fitness center; Game room with billiards, pok... | https://www.moderalibertyrow.com/ | SouthPark | 2.256528 |
| 1 | Moderna Liberty Row | 7740 Liberty Row Dr, Charlotte, NC 28210 | A01 | 1.0 | 1.0 | 1707.0 | 747.0 | In-unit washer/dryer; High-speed internet in common areas; Controlled access bicycle storage; Additional storage available; Resort-style pool; 24-hour fitness center; Game room with billiards, pok... | https://www.moderalibertyrow.com/ | SouthPark | 2.285141 |
| 2 | Moderna Liberty Row | 7740 Liberty Row Dr, Charlotte, NC 28210 | A02 | 1.0 | 1.0 | 1707.0 | 747.0 | In-unit washer/dryer; High-speed internet in common areas; Controlled access bicycle storage; Additional storage available; Resort-style pool; 24-hour fitness center; Game room with billiards, pok... | https://www.moderalibertyrow.com/ | SouthPark | 2.285141 |
| 3 | Moderna Liberty Row | 7740 Liberty Row Dr, Charlotte, NC 28210 | A03 | 1.0 | 1.0 | 1532.0 | 801.0 | In-unit washer/dryer; High-speed internet in common areas; Controlled access bicycle storage; Additional storage available; Resort-style pool; 24-hour fitness center; Game room with billiards, pok... | https://www.moderalibertyrow.com/ | SouthPark | 1.912609 |
| 4 | Moderna Liberty Row | 7740 Liberty Row Dr, Charlotte, NC 28210 | A04 | 1.0 | 1.0 | 1766.0 | 861.0 | In-unit washer/dryer; High-speed internet in common areas; Controlled access bicycle storage; Additional storage available; Resort-style pool; 24-hour fitness center; Game room with billiards, pok... | https://www.moderalibertyrow.com/ | SouthPark | 2.051103 |
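The rent imputation above fills each missing value with the mean rent of units in the same complex with the same bedroom count. The sketch below reproduces that idea in plain Python so the behavior is easy to inspect. It rests on assumptions: the function name `impute_group_mean` is hypothetical, and the missing values in the sample rows are invented for illustration (the observed rents are the one-bedroom values shown in the table).

```python
from collections import defaultdict
from statistics import fmean

def impute_group_mean(rows, keys=("Complex", "Bedrooms"), target="Rent"):
    """Fill missing target values with the mean of rows that share the same keys."""
    observed = defaultdict(list)
    for row in rows:
        if row[target] is not None:
            observed[tuple(row[k] for k in keys)].append(row[target])
    filled = []
    for row in rows:
        group = tuple(row[k] for k in keys)
        value = row[target]
        if value is None and observed[group]:
            value = fmean(observed[group])
        filled.append({**row, target: value})
    return filled

rows = [
    {"Complex": "Moderna Liberty Row", "Bedrooms": 1, "Rent": 1707.0},
    {"Complex": "Moderna Liberty Row", "Bedrooms": 1, "Rent": None},  # hypothetical gap
    {"Complex": "Moderna Liberty Row", "Bedrooms": 1, "Rent": 1532.0},
    {"Complex": "Moderna Liberty Row", "Bedrooms": 1, "Rent": 1766.0},
    {"Complex": "Moderna Liberty Row", "Bedrooms": 0, "Rent": None},  # no observed peers
]
filled = impute_group_mean(rows)
print([r["Rent"] for r in filled])
```

Two caveats follow from this design. A group with no observed rent stays missing, so the post-imputation check for remaining gaps is still needed. Imputed rows also sit exactly at their group mean, which narrows within-group spread and feeds directly into any PPSF computed from them.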
Base Map of Apartment Listings¶
geolocator = Nominatim(user_agent="apt_mapper", timeout=10) # Increased timeout
def geocode_address(address, retries=3):
    """Geocode an address, retrying on timeouts; return None if every attempt fails."""
    for attempt in range(retries):
        try:
            return geolocator.geocode(address)
        except GeocoderTimedOut:
            print(f"Timeout, retrying ({attempt+1}/{retries}): {address}")
            time.sleep(2)
    print(f"Failed to geocode: {address}")
    return None
# Initialize map centered on Charlotte
charlotte_map = folium.Map(location=[35.2271, -80.8431], zoom_start=12)
for _, row in apt.iterrows():
    address = row['Address']
    location = geocode_address(address)
    if location:
        folium.Marker(
            location=[location.latitude, location.longitude],
            popup=f"{row['Complex']}<br>Neighborhood: {row['Neighborhood']}<br>Address: {row['Address']}",
            tooltip=row['Complex']
        ).add_to(charlotte_map)
    time.sleep(1)  # Polite pause between requests
# Save map
charlotte_map.save("charlotte_apartments_map.html")
charlotte_map
This initial map offers a spatial overview of Charlotte’s rental dataset, plotting all apartment listings without pricing overlays. It establishes the geographic footprint of the data and sets the stage for deeper spatial analysis.
Listings are distributed across the city, with visible concentrations in Uptown, South End, and along key transit corridors. These clusters reflect high-density development zones and areas of elevated rental activity. Peripheral gaps suggest lower listing volume or less multifamily saturation, particularly in suburban and industrial zones.
Spatial coverage reveals the dataset’s reach, while density patterns hint at market segmentation. High-density zones often coincide with walkable neighborhoods, mixed-use developments, and lifestyle districts. Sparse areas may reflect zoning constraints, affordability pockets, or emerging submarkets not yet saturated with listings.
This map serves as a visual anchor for the analysis that follows. By establishing where rental activity is concentrated, it allows viewers to interpret pricing logic, clustering behavior, and outlier detection in geographic context. It also supports stakeholder orientation, helping developers, investors, and planners locate areas of interest before layering in pricing or segmentation data.
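Because many unit rows share one building address, the geocoding loop above sends the same lookup repeatedly. One way to reduce requests is to geocode each distinct address once and reuse the result. The sketch below is an assumption-laden illustration: `geocode_unique` is a hypothetical helper, and the geocoder is passed in as a plain function (`fake_geocoder` here) so the example runs without network access.

```python
import time

def geocode_unique(addresses, geocode_fn, pause_seconds=1.0):
    """Geocode each distinct address once; return {address: result}."""
    cache = {}
    for address in addresses:
        if address in cache:
            continue  # already looked up, no new request
        if cache:
            time.sleep(pause_seconds)  # pause between requests, not before the first
        cache[address] = geocode_fn(address)
    return cache

# Stand-in geocoder for demonstration; a real run would pass a function
# that wraps the actual geocoding service.
calls = []
def fake_geocoder(address):
    calls.append(address)
    return (35.2271, -80.8431)

addresses = [
    "7740 Liberty Row Dr, Charlotte, NC 28210",
    "7740 Liberty Row Dr, Charlotte, NC 28210",
    "2200 Dunavant St, Charlotte, NC 28203",
]
coords = geocode_unique(addresses, fake_geocoder, pause_seconds=0)
print(len(calls), "requests for", len(addresses), "rows")
```

The resulting dictionary can then be mapped back onto every unit row, so later cells do not need to geocode again.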
PPSF Gradient Map of Apartment Listings¶
def get_color(price_per_sqft):
    """Map a PPSF value to a marker color: green (< 1.50), orange (< 2.00), red otherwise."""
    if price_per_sqft < 1.5:
        return 'green'
    elif price_per_sqft < 2.0:
        return 'orange'
    else:
        return 'red'
for _, row in apt.iterrows():
    address = row['Address']
    location = geocode_address(address)
    if location:
        price_per_sqft = row['price_per_sqft']
        # NaN is truthy, so test for missing values explicitly
        color = get_color(price_per_sqft) if pd.notna(price_per_sqft) else 'gray'
        folium.Marker(
            location=[location.latitude, location.longitude],
            popup=f"{row['Complex']}<br>Neighborhood: {row['Neighborhood']}<br>PPSF: ${price_per_sqft:.2f}<br>Address: {row['Address']}",
            tooltip=row['Neighborhood'],
            icon=folium.Icon(color=color, icon='home', prefix='fa')  # FontAwesome 'home' icon
        ).add_to(charlotte_map)
    time.sleep(1)
# Use NaN (not None) so the columns stay numeric and can be averaged below
apt['Latitude'] = np.nan
apt['Longitude'] = np.nan
for idx, row in apt.iterrows():
    address = row['Address']
    location = geocode_address(address)
    if location:
        apt.at[idx, 'Latitude'] = location.latitude
        apt.at[idx, 'Longitude'] = location.longitude
    time.sleep(1)
# Label each neighborhood at the mean position of its geocoded listings
neighborhood_coords = apt.groupby('Neighborhood')[['Latitude', 'Longitude']].mean().dropna()
for name, coords in neighborhood_coords.iterrows():
    folium.Marker(
        location=[coords['Latitude'], coords['Longitude']],
        icon=folium.DivIcon(html=f"<div style='font-size:12px;color:black;'>{name}</div>")
    ).add_to(charlotte_map)
charlotte_map.save("charlotte_map.html")
# Load the saved map
with open("charlotte_map.html", "r", encoding="utf-8") as f:
    html = f.read()
# Inject the legend div before the closing </body> tag
legend_html = """
<div style="
position: fixed;
bottom: 50px;
left: 50px;
width: 180px;
height: 120px;
background-color: white;
border:2px solid grey;
z-index:9999;
font-size:14px;
padding: 10px;
box-shadow: 2px 2px 6px rgba(0,0,0,0.3);
">
<b>PPSF Legend</b><br>
<i class="fa fa-home fa-1x" style="color:green"></i> < $1.50<br>
<i class="fa fa-home fa-1x" style="color:orange"></i> $1.50 – $2.00<br>
<i class="fa fa-home fa-1x" style="color:red"></i> > $2.00
</div>
"""
soup = BeautifulSoup(html, "html.parser")
soup.body.append(BeautifulSoup(legend_html, "html.parser"))
# Save the modified file
with open("charlotte_map_with_legend.html", "w", encoding="utf-8") as f:
    f.write(str(soup))
# Reuse the legend_html string defined above for the inline notebook display
legend = folium.Element(legend_html)
charlotte_map.get_root().html.add_child(legend)
charlotte_map
This second map introduces spatial pricing variation by color-coding apartment listings according to price per square foot (PPSF). It groups listings into three bands (green below $\$1.50$, orange from $\$1.50$ up to $\$2.00$, and red at $\$2.00$ and above), allowing viewers to intuitively interpret affordability, premium zones, and pricing anomalies across Charlotte.
High-PPSF units cluster in central neighborhoods such as Uptown, South End, and Dilworth, reflecting elevated demand for compact, amenity-rich living. These areas are dominated by red markers, indicating consistent pricing premiums. In contrast, peripheral zones like Mallard Creek and Tyvola show more green and orange markers, signaling lower PPSF and broader affordability.
The gradient also supports visual detection of outliers and pricing corridors. Isolated high-PPSF listings in midrange zones may represent branded offerings or transitional submarkets. Conversely, low-PPSF units in premium areas may signal underpriced inventory or legacy buildings.
This map bridges statistical insight with geographic context. By layering PPSF onto spatial coordinates, it reveals how pricing logic varies across the city and sets the stage for deeper segmentation through clustering and boundary overlays.
PPSF Map with Custom Neighborhood Boundaries¶
# --------------------------
# 1. Apartment Data Cleanup
# --------------------------
unique_apartments = apt[["Complex", "Address"]].drop_duplicates().reset_index(drop=True)
apt_df = pd.DataFrame(unique_apartments)
apt_df = apt_df.dropna(subset=["Complex", "Address"]).reset_index(drop=True)
# --------------------------
# 2. Neighborhood Landmarks
# --------------------------
landmarks_df = pd.DataFrame({
    "Neighborhood": [
        "Uptown", "South End", "NoDa", "SouthPark",
        "University City", "West Charlotte", "South Charlotte"
    ],
    "Landmark": [
        "Bank of America Stadium",
        "Atherton Mill & Market",
        "The Evening Muse",
        "SouthPark Mall",
        "UNC Charlotte Main Campus",
        "Freedom Park",
        "Carowinds Theme Park"
    ],
    "Address": [
        "800 S Mint St, Charlotte, NC 28202",
        "2140 South Blvd, Charlotte, NC 28203",
        "322 E 36th St, Charlotte, NC 28205",
        "4400 Sharon Rd, Charlotte, NC 28211",
        "9201 University City Blvd, Charlotte, NC 28223",
        "1900 East Blvd, Charlotte, NC 28203",
        "14523 Carowinds Blvd, Charlotte, NC 28273"
    ]
})
# --------------------------
# 3. Geocoding Setup
# --------------------------
geolocator = Nominatim(user_agent="charlotte_geo")
def geocode_address(address):
    """Return (latitude, longitude) for an address, or (None, None) on failure."""
    try:
        location = geolocator.geocode(address)
        time.sleep(1)  # Polite pause after every request
        if location:
            return (location.latitude, location.longitude)
        return (None, None)
    except Exception:
        return (None, None)
# --------------------------
# 4. Geocode Landmarks
# --------------------------
landmarks_df["Coordinates"] = landmarks_df["Address"].apply(geocode_address)
# --------------------------
# 5. Geocode Apartments
# --------------------------
apt_df["Coordinates"] = apt_df["Address"].apply(geocode_address)
# --------------------------
# 6. Assign Neighborhoods
# --------------------------
def assign_neighborhood(apartment_coord):
    """Return the neighborhood whose landmark is closest to the apartment."""
    if not apartment_coord or None in apartment_coord:
        return None
    min_distance = float('inf')
    closest_neighborhood = None
    for _, row in landmarks_df.iterrows():
        landmark_coord = row["Coordinates"]
        if not landmark_coord or None in landmark_coord:
            continue
        distance = geodesic(apartment_coord, landmark_coord).miles
        if distance < min_distance:
            min_distance = distance
            closest_neighborhood = row["Neighborhood"]
    return closest_neighborhood
apt_df["Neighborhood"] = apt_df["Coordinates"].apply(assign_neighborhood)
# --------------------------
# 7. Initialize Map
# --------------------------
charlotte_map = folium.Map(location=[35.2271, -80.8431], zoom_start=12)
# --------------------------
# 8. Visualize Neighborhood Zones
# --------------------------
for _, row in landmarks_df.iterrows():
    coord = row["Coordinates"]
    if None not in coord:
        folium.Circle(
            location=coord,
            radius=2000,  # meters
            color='gray',
            fill=True,
            fill_color='lightblue',
            fill_opacity=0.3,
            popup=None,
            tooltip=None
        ).add_to(charlotte_map)
# --------------------------
# 9. Display Final DataFrame
# --------------------------
apt_df[["Complex", "Address", "Neighborhood"]]
| | Complex | Address | Neighborhood |
|---|---|---|---|
| 0 | Moderna Liberty Row | 7740 Liberty Row Dr, Charlotte, NC 28210 | SouthPark |
| 1 | Tyvola Tapestry | 2051 Establishment Wy, Charlotte, NC 28217 | SouthPark |
| 2 | The Landon | 8200 Riverbirch Dr, Charlotte, NC 28210 | SouthPark |
| 3 | Hawkins Press | 2200 Dunavant St, Charlotte, NC 28203 | South End |
| 4 | Novel Mallard Creek | 9132 Senator Royall Dr, Charlotte, NC 28262 | University City |
| 5 | Ello House | 3615 Tryclan Dr, Charlotte, NC 28217 | South End |
| 6 | The Henry | 404 W 26th St, Charlotte, NC 28206 | NoDa |
| 7 | The Leo LoSo | 4520 Charlotte Park Drive Charlotte, NC 28217 | South End |
| 8 | The Perch | 718 Gesco St, Charlotte, NC 28208 | Uptown |
| 9 | Bond on Mint | 1007 S Mint St, Charlotte, NC 28203 | Uptown |
| 10 | Solis Midtown | 1133 Harding Pl, Charlotte, NC 28204 | West Charlotte |
| 11 | Broadstone Craft | 1015 N Alexander Street, Charlotte, NC 28206 | Uptown |
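The assignment step above labels each complex with the neighborhood of its nearest landmark. The sketch below shows the same nearest-point logic with a haversine (great-circle) distance swapped in for the geodesic calculation, so it runs with the standard library alone. The function names are hypothetical and the landmark coordinates are approximate values chosen for illustration, not geocoder output.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) pairs given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))

def nearest_neighborhood(point, landmarks):
    """Return the neighborhood whose landmark is closest to point, or None if there are none."""
    if not landmarks:
        return None
    return min(landmarks, key=lambda name: haversine_miles(point, landmarks[name]))

# Approximate landmark coordinates, for illustration only
landmarks = {
    "Uptown": (35.2258, -80.8528),
    "SouthPark": (35.1527, -80.8320),
    "University City": (35.3071, -80.7352),
}
print(nearest_neighborhood((35.2271, -80.8431), landmarks))
```

A nearest-landmark rule partitions the city purely by distance, so each label is only as good as the landmark chosen to represent its neighborhood. A complex near a boundary can be assigned to a neighborhood that local usage would not recognize.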
# --------------------------
# 1. Apartment Data Cleanup
# --------------------------
unique_apartments = apt[["Complex", "Address", "price_per_sqft"]].drop_duplicates().reset_index(drop=True)
apt_df = pd.DataFrame(unique_apartments)
apt_df = apt_df.dropna(subset=["Complex", "Address"]).reset_index(drop=True)
# --------------------------
# 2. Neighborhood Landmarks
# --------------------------
landmarks_df = pd.DataFrame({
    "Neighborhood": [
        "Uptown", "South End", "NoDa", "SouthPark",
        "University City", "West Charlotte", "South Charlotte"
    ],
    "Landmark": [
        "Bank of America Stadium",
        "Atherton Mill & Market",
        "The Evening Muse",
        "SouthPark Mall",
        "UNC Charlotte Main Campus",
        "Freedom Park",
        "Carowinds Theme Park"
    ],
    "Address": [
        "800 S Mint St, Charlotte, NC 28202",
        "2140 South Blvd, Charlotte, NC 28203",
        "322 E 36th St, Charlotte, NC 28205",
        "4400 Sharon Rd, Charlotte, NC 28211",
        "9201 University City Blvd, Charlotte, NC 28223",
        "1900 East Blvd, Charlotte, NC 28203",
        "14523 Carowinds Blvd, Charlotte, NC 28273"
    ]
})
# --------------------------
# 3. Geocoding Setup
# --------------------------
geolocator = Nominatim(user_agent="charlotte_geo")
def geocode_address(address):
    """Return (latitude, longitude) for an address, or (None, None) on failure."""
    try:
        location = geolocator.geocode(address)
        time.sleep(1)  # Polite pause after every request
        if location:
            return (location.latitude, location.longitude)
        return (None, None)
    except Exception:
        return (None, None)
# --------------------------
# 4. Geocode Landmarks
# --------------------------
landmarks_df["Coordinates"] = landmarks_df["Address"].apply(geocode_address)
# --------------------------
# 5. Geocode Apartments
# --------------------------
apt_df["Coordinates"] = apt_df["Address"].apply(geocode_address)
# --------------------------
# 6. Assign Neighborhoods
# --------------------------
def assign_neighborhood(apartment_coord):
    """Return the neighborhood whose landmark is closest to the apartment."""
    if not apartment_coord or None in apartment_coord:
        return None
    min_distance = float('inf')
    closest_neighborhood = None
    for _, row in landmarks_df.iterrows():
        landmark_coord = row["Coordinates"]
        if not landmark_coord or None in landmark_coord:
            continue
        distance = geodesic(apartment_coord, landmark_coord).miles
        if distance < min_distance:
            min_distance = distance
            closest_neighborhood = row["Neighborhood"]
    return closest_neighborhood
apt_df["Neighborhood"] = apt_df["Coordinates"].apply(assign_neighborhood)
# --------------------------
# 7. Initialize Map
# --------------------------
charlotte_map = folium.Map(location=[35.2271, -80.8431], zoom_start=12)
# --------------------------
# 8. Visualize Neighborhood Zones with Convex Hulls
# --------------------------
# Keep only rows that geocoded successfully; failed lookups are stored as (None, None)
geocoded = apt_df[apt_df["Coordinates"].apply(lambda c: c is not None and None not in c)]
neighborhood_groups = geocoded.dropna(subset=["Neighborhood"]).groupby("Neighborhood")
for neighborhood, group in neighborhood_groups:
    # Distinct (lon, lat) points only: units in one complex share a single coordinate
    coords = group["Coordinates"].apply(lambda x: (x[1], x[0])).drop_duplicates().tolist()
    if len(coords) < 3:
        # Too few distinct points for a polygon, so draw a circle at their centroid
        centroid = np.mean(np.array(coords), axis=0)
        folium.Circle(
            location=(centroid[1], centroid[0]),
            radius=1000,
            color='gray',
            fill=True,
            fill_color='lightblue',
            fill_opacity=0.3
        ).add_to(charlotte_map)
        # Add label
        folium.Marker(
            location=(centroid[1], centroid[0]),
            icon=folium.DivIcon(html=f"""
            <div style='font-size:12px; color:black; font-weight:bold;'>{neighborhood}</div>
            """)
        ).add_to(charlotte_map)
        continue
    hull = MultiPoint(coords).convex_hull
    if isinstance(hull, Polygon):
        folium.Polygon(
            locations=[(pt[1], pt[0]) for pt in hull.exterior.coords],
            color='gray',
            fill=True,
            fill_color='lightblue',
            fill_opacity=0.3
        ).add_to(charlotte_map)
        # Add label at polygon centroid
        centroid = hull.centroid
        folium.Marker(
            location=(centroid.y, centroid.x),
            icon=folium.DivIcon(html=f"""
            <div style='font-size:12px; color:black; font-weight:bold;'>{neighborhood}</div>
            """)
        ).add_to(charlotte_map)
# --------------------------
# 9. Add Apartment Markers Colored by price_per_sqft
# --------------------------
def get_color(price_per_sqft):
    """Map a PPSF value to a marker color; missing values are gray."""
    if price_per_sqft is None or pd.isna(price_per_sqft):
        return 'gray'
    elif price_per_sqft < 1.5:
        return 'green'
    elif price_per_sqft < 2.0:
        return 'orange'
    else:
        return 'red'
for _, row in apt_df.iterrows():
    coord = row["Coordinates"]
    if None not in coord:
        price = row.get("price_per_sqft", None)
        color = get_color(price)
        popup_text = f"{row['Complex']}<br>{row['Address']}"
        if pd.notna(price):  # also skips NaN, which "is not None" would let through
            popup_text += f"<br>Price/SqFt: ${price:.2f}"
        folium.Marker(
            location=coord,
            icon=folium.Icon(color=color, icon="home", prefix="fa"),
            popup=popup_text
        ).add_to(charlotte_map)
# --------------------------
# 10. Add PPSF Legend
# --------------------------
legend_html = """
<div style="
position: fixed;
bottom: 50px;
left: 50px;
width: 180px;
height: 140px;
background-color: white;
border:2px solid grey;
z-index:9999;
font-size:14px;
padding: 10px;
box-shadow: 2px 2px 6px rgba(0,0,0,0.3);
">
<b>Price per SqFt</b><br>
<i class="fa fa-home fa-1x" style="color:green"></i> < $1.50<br>
<i class="fa fa-home fa-1x" style="color:orange"></i> $1.50 – $2.00<br>
<i class="fa fa-home fa-1x" style="color:red"></i> > $2.00<br>
<i class="fa fa-home fa-1x" style="color:gray"></i> Unknown
</div>
"""
charlotte_map.get_root().html.add_child(Element(legend_html))
# --------------------------
# 11. Display the Map
# --------------------------
charlotte_map
This third map overlays price per square foot (PPSF) onto Charlotte’s rental landscape while incorporating custom neighborhood boundaries. It anchors pricing logic in geographic context, allowing for a more nuanced interpretation of spatial segmentation.
By combining PPSF gradients with neighborhood outlines, the map reveals how pricing behavior aligns or misaligns with established submarket identities. In some areas, such as Uptown and South End, high PPSF values closely follow neighborhood boundaries, validating their role as premium zones. In other cases, pricing gradients spill across borders, suggesting transitional markets or emerging micro-neighborhoods.
This overlay also supports cluster interpretation. Listings within the same neighborhood may belong to different pricing clusters, highlighting internal segmentation and diverse renter targeting. Conversely, similar PPSF values across adjacent neighborhoods may indicate broader pricing corridors or shared development logic.
For stakeholders, this map enhances storytelling and strategic insight. Developers can assess whether pricing supports branding claims. Investors can identify undervalued zones within high-demand areas. Planners can spot affordability gaps and zoning effects. By contextualizing pricing within geographic boundaries, the map transforms raw metrics into actionable market intelligence.
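The custom boundaries on this map are convex hulls drawn around each neighborhood's listing coordinates. To make that construction concrete, the sketch below implements a hull with Andrew's monotone chain algorithm in plain Python. It is a stand-in for the geometry library call used in the notebook, not a copy of it, and the function name `convex_hull` and the sample points are illustrative assumptions.

```python
def convex_hull(points):
    """Andrew's monotone chain: return hull vertices in counter-clockwise order."""
    pts = sorted(set(points))  # distinct points, sorted by x then y
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when o -> a -> b turns counter-clockwise
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

# Four corner listings plus one interior listing (illustrative points)
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5)]
print(convex_hull(square))         # the interior point is dropped
print(convex_hull([(1, 1)] * 5))   # many units at one address: a single point
print(convex_hull([(0, 0), (1, 1), (2, 2)]))  # collinear addresses: a line, not a polygon
```

The degenerate cases matter for this dataset: a neighborhood whose listings share one address, or whose addresses fall on a line, yields fewer than three hull vertices and therefore no polygon to shade.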
Neighborhood Price Per Sqft with Median Pricing Overlays¶
# --------------------------
# 0. Geocoder Setup
# --------------------------
geolocator = Nominatim(user_agent="charlotte_geo")
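def geocode_address(address):
    """Return (latitude, longitude) for an address, or None on failure."""
    try:
        location = geolocator.geocode(address)
        time.sleep(1)  # Polite pause after every request
        if location:
            return (location.latitude, location.longitude)
        return None
    except Exception:
        return None
# --------------------------
# 1. Recompute Coordinates
# --------------------------
# Geocode each unique address once, then map the results back onto every unit row
address_coords = {addr: geocode_address(addr) for addr in apt["Address"].dropna().unique()}
apt["Coordinates"] = apt["Address"].map(address_coords.get)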
# --------------------------
# 2. Filter Outliers
# --------------------------
filtered_df = apt[
    (apt["price_per_sqft"] >= 0.5) &
    (apt["price_per_sqft"] <= 5.0) &
    (apt["Neighborhood"].notna()) &
    (apt["Coordinates"].notna())
]
# --------------------------
# 3. Aggregate Metrics
# --------------------------
agg_df = filtered_df.groupby("Neighborhood").agg(
    Median_PPSF=("price_per_sqft", "median"),
    Avg_Rent=("Rent", "mean"),
    Unit_Count=("Complex", "count")
).reset_index()
# --------------------------
# 4. Build Polygon Dictionary
# --------------------------
neighborhood_polygons = {}
for neighborhood, group in filtered_df.groupby("Neighborhood"):
    coords = group["Coordinates"].apply(lambda x: (x[1], x[0])).tolist()  # (lon, lat)
    if len(coords) >= 3:
        polygon = MultiPoint(coords).convex_hull
        neighborhood_polygons[neighborhood] = polygon
# --------------------------
# 5. Normalize PPSF for Color Scale
# --------------------------
norm = mcolors.Normalize(
    vmin=agg_df["Median_PPSF"].min(),
    vmax=agg_df["Median_PPSF"].max()
)
# plt.get_cmap works across Matplotlib versions; cm.get_cmap was removed in 3.9
cmap = plt.get_cmap("YlOrRd")
# --------------------------
# 6. Initialize Map
# --------------------------
choropleth_map = folium.Map(location=[35.2271, -80.8431], zoom_start=12)
# --------------------------
# 7. Add Neighborhood Polygons with Choropleth Shading
# --------------------------
for _, row in agg_df.iterrows():
    neighborhood = row["Neighborhood"]
    polygon = neighborhood_polygons.get(neighborhood)
    if polygon is not None:
        color = mcolors.to_hex(cmap(norm(row["Median_PPSF"])))
        popup_text = (
            f"<b>{neighborhood}</b><br>"
            f"Median PPSF: ${row['Median_PPSF']:.2f}<br>"
            f"Avg Rent: ${row['Avg_Rent']:.0f}<br>"
            f"Units: {row['Unit_Count']}"
        )
        # Only draw if it's a Polygon (a hull can collapse to a point or line)
        if isinstance(polygon, Polygon):
            folium.Polygon(
                locations=[(pt[1], pt[0]) for pt in polygon.exterior.coords],
                color=color,
                fill=True,
                fill_color=color,
                fill_opacity=0.6,
                popup=popup_text
            ).add_to(choropleth_map)
            # Add label at centroid
            centroid = polygon.centroid
            folium.Marker(
                location=(centroid.y, centroid.x),
                icon=folium.DivIcon(html=f"""
                <div style='font-size:12px; color:black; font-weight:bold;'>{neighborhood}</div>
                """)
            ).add_to(choropleth_map)
# --------------------------
# 8. Add Color Legend
# --------------------------
legend_html = """
<div style="
position: fixed;
bottom: 50px;
left: 50px;
width: 200px;
height: 140px;
background-color: white;
border:2px solid grey;
z-index:9999;
font-size:14px;
padding: 10px;
box-shadow: 2px 2px 6px rgba(0,0,0,0.3);
">
<b>Median PPSF by Neighborhood</b><br>
<i style="background:#ffffb2;width:20px;height:10px;display:inline-block;"></i> Low<br>
<i style="background:#fecc5c;width:20px;height:10px;display:inline-block;"></i> Medium<br>
<i style="background:#fd8d3c;width:20px;height:10px;display:inline-block;"></i> High<br>
<i style="background:#e31a1c;width:20px;height:10px;display:inline-block;"></i> Very High
</div>
"""
choropleth_map.get_root().html.add_child(Element(legend_html))
# --------------------------
# 9. Display the Map
# --------------------------
choropleth_map
This map visualizes Charlotte’s core rental neighborhoods (Uptown, South End, and SouthPark) using color-coded boundaries based on median price per square foot (PPSF). Each neighborhood is shaded according to its median PPSF, and clicking a polygon opens a popup with the median PPSF, average rent, and unit count. The visualization focuses on areas with sufficient listing density to support meaningful statistical comparison.
Uptown shows the highest median PPSF at $\$2.48$/sqft, with an average rent of $\$2,121$ across 63 units. The deep saturation reflects its premium status, driven by walkability, skyline views, and proximity to employment centers. Uptown’s pricing confirms its role as Charlotte’s luxury core.
South End closely trails Uptown, with a median PPSF of $\$2.38$ and an average rent of $\$2,035$ across 82 units. Its pricing reflects strong demand for lifestyle-oriented living, with access to breweries, retail, and transit. The neighborhood’s slightly lower PPSF suggests a mix of luxury and midrange offerings.
SouthPark shows a median PPSF of $\$1.69$ and an average rent of $\$1,675$ across 53 units. The lighter shading indicates broader affordability, despite its reputation as a retail and office hub. SouthPark’s lower PPSF may reflect larger unit sizes, suburban layouts, or legacy inventory.
This map reveals how pricing logic varies across Charlotte’s most active rental zones. By anchoring PPSF in geographic context, it validates Uptown and South End as premium submarkets while positioning SouthPark as a more affordable alternative. The popups provide quick access to summary metrics, supporting stakeholder interpretation and strategic comparison.
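The shading on this map comes from linearly rescaling each neighborhood's median PPSF between the lowest and highest medians on the map. The short sketch below works through that arithmetic with the three medians reported above. The function name `normalize` is illustrative, and the guard for identical minimum and maximum is an added assumption.

```python
def normalize(value, vmin, vmax):
    """Linearly rescale value so that vmin maps to 0.0 and vmax maps to 1.0."""
    if vmax == vmin:
        return 0.0  # guard against division by zero when all medians are equal
    return (value - vmin) / (vmax - vmin)

# Median PPSF figures reported for the three mapped neighborhoods
median_ppsf = {"Uptown": 2.48, "South End": 2.38, "SouthPark": 1.69}

vmin, vmax = min(median_ppsf.values()), max(median_ppsf.values())
scaled = {name: normalize(v, vmin, vmax) for name, v in median_ppsf.items()}
for name, position in scaled.items():
    print(f"{name}: {position:.2f}")
```

South End lands at roughly 0.87 on the scale, close to Uptown's 1.0, while SouthPark sits at 0.0. Because the scale is relative to the neighborhoods on the map, the palest and darkest shades always go to the lowest and highest medians present, so the legend's Low to Very High labels describe rank within this map rather than fixed dollar bands.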
Why Other Neighborhoods Are Excluded¶
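While the dataset includes properties in West Charlotte, NoDa, and University City, each of these neighborhoods contains only one complex. With such limited data, a median PPSF or average rent would describe a single property rather than the neighborhood, which would be statistically misleading. A convex hull built from a single address also collapses to a point, so no boundary can be drawn for these areas.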
To ensure analytical integrity, the map highlights only Uptown, South End, and SouthPark. These neighborhoods contain between 53 and 82 units each, allowing for robust summary metrics and clearer spatial interpretation. As more listings become available in lower-density neighborhoods, future iterations of the map may expand to include them. For now, the focus remains on areas with enough data to support confident analysis and stakeholder insight.
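The inclusion rule described above can be expressed as a count of distinct complexes per neighborhood. The sketch below applies it to the landmark-based assignments shown earlier in this notebook. The minimum of three complexes is an assumption chosen to match the three distinct points a polygon needs, and the function name is hypothetical.

```python
from collections import defaultdict

# (complex, neighborhood) pairs from the landmark-based assignment table above
assignments = [
    ("Moderna Liberty Row", "SouthPark"), ("Tyvola Tapestry", "SouthPark"),
    ("The Landon", "SouthPark"), ("Hawkins Press", "South End"),
    ("Novel Mallard Creek", "University City"), ("Ello House", "South End"),
    ("The Henry", "NoDa"), ("The Leo LoSo", "South End"),
    ("The Perch", "Uptown"), ("Bond on Mint", "Uptown"),
    ("Solis Midtown", "West Charlotte"), ("Broadstone Craft", "Uptown"),
]

def neighborhoods_with_enough_complexes(pairs, min_complexes=3):
    """Return neighborhoods that contain at least min_complexes distinct complexes."""
    complexes = defaultdict(set)
    for complex_name, neighborhood in pairs:
        complexes[neighborhood].add(complex_name)
    return sorted(n for n, names in complexes.items() if len(names) >= min_complexes)

print(neighborhoods_with_enough_complexes(assignments))
```

Counting complexes rather than unit rows avoids treating one large building as a well-sampled neighborhood.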
Heatmaps of Value vs. Cost¶
# High PPSF: > $2.00
high_ppsf_df = apt[(apt["price_per_sqft"] > 2.0) & apt["Coordinates"].notna()]
# Low PPSF: < $1.50
low_ppsf_df = apt[(apt["price_per_sqft"] < 1.5) & apt["Coordinates"].notna()]
# Extract coordinates for heatmap
high_coords = [list(coord) for coord in high_ppsf_df["Coordinates"]]
low_coords = [list(coord) for coord in low_ppsf_df["Coordinates"]]
# Initialize maps
high_map = folium.Map(location=[35.2271, -80.8431], zoom_start=12)
low_map = folium.Map(location=[35.2271, -80.8431], zoom_start=12)
# Add heatmaps
HeatMap(high_coords, radius=12, blur=15, max_zoom=1).add_to(high_map)
HeatMap(low_coords, radius=12, blur=15, max_zoom=1).add_to(low_map)
# High PPSF overlays
high_cluster = MarkerCluster().add_to(high_map)
for _, row in high_ppsf_df.iterrows():
    popup = f"{row['Complex']}<br>{row['Address']}<br>{row['price_per_sqft']:.2f} PPSF"
    # The column is named "Sqft"; a lowercase "sqft" lookup never matches
    if "Sqft" in row and pd.notna(row["Sqft"]):
        popup += f"<br>Size: {row['Sqft']} sqft"
    if "Amenities" in row and pd.notna(row["Amenities"]):
        popup += f"<br>Amenities: {row['Amenities']}"
    folium.Marker(location=row["Coordinates"], popup=popup).add_to(high_cluster)
# Low PPSF overlays
low_cluster = MarkerCluster().add_to(low_map)
for _, row in low_ppsf_df.iterrows():
    popup = f"{row['Complex']}<br>{row['Address']}<br>{row['price_per_sqft']:.2f} PPSF"
    if "Sqft" in row and pd.notna(row["Sqft"]):
        popup += f"<br>Size: {row['Sqft']} sqft"
    if "Amenities" in row and pd.notna(row["Amenities"]):
        popup += f"<br>Amenities: {row['Amenities']}"
    folium.Marker(location=row["Coordinates"], popup=popup).add_to(low_cluster)
# Build polygons from listing coordinates
neighborhood_polygons = {}
for neighborhood, group in apt.groupby("Neighborhood"):
    coords = group["Coordinates"].dropna().apply(lambda x: (x[1], x[0])).tolist()  # (lon, lat)
    if len(coords) >= 3:
        polygon = MultiPoint(coords).convex_hull
        neighborhood_polygons[neighborhood] = polygon
# Add to map
for name, poly in neighborhood_polygons.items():
    if isinstance(poly, Polygon):
        folium.Polygon(
            locations=[(pt[1], pt[0]) for pt in poly.exterior.coords],
            color="gray",
            weight=1,
            fill=True,  # 👈 Must be True to activate hover inside
            fill_opacity=0.1,  # 👈 Light fill so hover works but map stays readable
            tooltip=name  # 👈 This activates hover inside the polygon
        ).add_to(high_map)
        folium.Polygon(
            locations=[(pt[1], pt[0]) for pt in poly.exterior.coords],
            color="gray",
            weight=1,
            fill=True,
            fill_opacity=0.1,
            tooltip=name
        ).add_to(low_map)
High Price Per Sqft Map¶
high_map
This map visualizes Charlotte apartment listings with price per square foot (PPSF) greater than $\$2.00$, highlighting the city’s premium rental inventory. A heatmap layer shows where these listings concentrate, and a MarkerCluster overlay displays individual unit details. Each marker includes the complex name, address, PPSF, and when available, unit size and amenities. Neighborhood polygons, drawn as convex hulls around each neighborhood’s listings, are lightly shaded and labeled with tooltips to provide spatial context.
Moderna Liberty Row, Tyvola Tapestry, The Leo Loso, Ello House, Hawkins Press, Solis Midtown, Bond on Mint, The Perch, Broadstone Craft, The Henry, and Novel Mallard Creek all appear on this map. Several of these complexes price their units unevenly. Units above the $\$2.00$ PPSF threshold are included here, units below $\$1.50$ are captured in the low PPSF visualization, and units priced between the two thresholds appear on neither map. This internal variability suggests strategic segmentation within buildings, possibly based on unit size, floor level, view, renovation status, or amenity access.
Uptown shows the most concentrated high-PPSF activity, reinforcing its role as Charlotte’s luxury core. South End also displays strong saturation, reflecting demand for lifestyle-oriented living. Scattered high-PPSF units in transitional zones suggest emerging micro-markets or niche offerings.
This map reveals how premium pricing varies not only across neighborhoods but also within individual complexes. By layering PPSF intensity with spatial context, it supports nuanced analysis of Charlotte’s luxury rental landscape. For developers and investors, this segmentation offers insight into pricing strategy, unit differentiation, and geographic positioning.
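In the notebook, `HeatMap` receives coordinate pairs only, so heat intensity tracks how many qualifying listings sit near each other rather than how expensive they are. If PPSF itself should drive intensity, folium's `HeatMap` also accepts rows of the form `[lat, lng, weight]`. The sketch below prepares such rows. The helper name `weighted_heat_points`, the min-max scaling choice, and the sample values are assumptions for illustration.

```python
def weighted_heat_points(coords, ppsf_values):
    """Build [lat, lon, weight] rows with weights min-max scaled to the 0..1 range."""
    lo, hi = min(ppsf_values), max(ppsf_values)
    span = hi - lo
    points = []
    for (lat, lon), ppsf in zip(coords, ppsf_values):
        # If every listing has the same PPSF, fall back to equal weights
        weight = (ppsf - lo) / span if span > 0 else 1.0
        points.append([lat, lon, round(weight, 3)])
    return points

# Made-up coordinates and PPSF values, not taken from the dataset
coords = [(35.227, -80.843), (35.214, -80.857), (35.152, -80.832)]
ppsf = [2.10, 2.60, 3.10]
points = weighted_heat_points(coords, ppsf)
```

The resulting rows could be passed to `HeatMap(points, radius=12, blur=15)` in place of the coordinate-only list. With plain min-max scaling the cheapest listing gets a weight of zero, so a small floor may be worth adding if every listing should remain visible.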
Low Price Per Sqft Map¶
low_map
This map visualizes Charlotte apartment listings with price per square foot (PPSF) less than $\$1.50$, highlighting the city’s more affordable rental inventory. A heatmap layer shows where these listings concentrate, and a MarkerCluster overlay displays individual unit details. Each marker includes the complex name, address, PPSF, and when available, unit size and amenities. Neighborhood polygons, drawn as convex hulls around each neighborhood’s listings, are lightly shaded and labeled with tooltips to provide spatial context.
The Landon appears prominently on this map, along with select units from Moderna Liberty Row, The Leo Loso, and The Henry. These properties contain a mix of unit-level pricing. Units below the $\$1.50$ PPSF threshold are included here, units above $\$2.00$ are captured in the high PPSF visualization, and units priced between the two thresholds appear on neither map. This internal variability suggests pricing differentiation within buildings, potentially based on unit type, renovation status, or strategic leasing decisions.
Low-PPSF listings are more dispersed, with fewer concentrated clusters compared to the high PPSF map. This spatial pattern may reflect broader affordability in transitional or suburban zones, or legacy inventory within otherwise premium complexes. The polygon overlays help anchor these listings in neighborhood context, allowing viewers to assess whether affordability is localized, scattered, or emerging.
This map reveals how budget-friendly pricing is distributed across Charlotte’s rental landscape. By layering PPSF intensity with spatial context, it supports analysis of affordability corridors, pricing anomalies, and internal segmentation within complexes. For planners and housing advocates, this visualization offers insight into where lower-cost units are located and how they relate to broader market dynamics.
Price Per Sqft Segmentation Overview¶
This paired map analysis reveals how Charlotte’s rental market stratifies spatially and structurally across price per square foot (PPSF). By isolating listings above $\$2.00$ and below $\$1.50$, the maps highlight the city’s premium and budget inventory bands, offering a clear view of pricing segmentation.
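Because the two thresholds leave a gap, a listing priced between $\$1.50$ and $\$2.00$ per square foot is on neither map. A rough sketch of the three-way split, and of how to spot complexes that land on both maps, might look like the following. The toy listings and the `ppsf_band` helper are made up for illustration. Only the column names and the strict thresholds mirror the notebook.

```python
import pandas as pd

# Made-up listings; only the column names mirror the notebook
toy = pd.DataFrame({
    "Complex": ["Alpha", "Alpha", "Alpha", "Beta", "Beta"],
    "price_per_sqft": [1.40, 1.75, 2.30, 1.20, 1.45],
})

def ppsf_band(ppsf):
    """Apply the notebook's strict thresholds: < 1.50 is low, > 2.00 is high."""
    if ppsf < 1.5:
        return "Low"
    if ppsf > 2.0:
        return "High"
    return "Mid (unmapped)"

toy["Band"] = toy["price_per_sqft"].apply(ppsf_band)

# Units per complex in each band; bands absent from a complex show as 0
band_counts = pd.crosstab(toy["Complex"], toy["Band"])

# Complexes with at least one unit on each of the two maps
mixed = band_counts[
    (band_counts["High"] > 0) & (band_counts["Low"] > 0)
].index.tolist()
```

Here only the hypothetical complex "Alpha" would appear on both maps, and its 1.75 PPSF unit would be shown on neither.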
High PPSF listings are concentrated in Uptown, South End, and adjacent lifestyle districts. These clusters reflect Charlotte’s luxury core, where location, amenities, and design drive elevated pricing. Complexes such as Moderna Liberty Row, Solis Midtown, and Bond on Mint show internal variation, with some units priced above $\$2.00$ and others below. This suggests strategic segmentation within buildings, likely based on unit features, renovation status, or leasing strategy.
Low PPSF listings are more dispersed, appearing in transitional zones and suburban edges. The Landon anchors this map, alongside select units from The Leo Loso, The Henry, and Moderna Liberty Row. These listings reflect broader affordability, legacy inventory, or pricing strategies aimed at value-conscious renters. The lower density of clusters suggests that budget-friendly units are less spatially concentrated, often embedded within mixed-price complexes.
Neighborhood polygons provide geographic context across both maps, helping viewers assess whether pricing behavior aligns with known submarkets or spills across boundaries. The marker popups add qualitative depth, showing how unit size and amenities contribute to pricing variation.
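One caveat when reading the polygons: they are convex hulls of the listing coordinates in each neighborhood, not official boundaries, so they only cover ground where listings were observed. The notebook relies on shapely's `convex_hull` for this. The from-scratch sketch below uses Andrew's monotone chain algorithm instead, purely to show what a hull does with a set of points. The function name and sample points are illustrative.

```python
def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts

    def cross(o, a, b):
        # Positive when o -> a -> b turns counter-clockwise
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # Drop the last point of each chain because it repeats the other chain's start
    return lower[:-1] + upper[:-1]

# Made-up (lon, lat) points: four corners plus one interior listing
points = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0), (1.0, 1.0)]
hull = convex_hull(points)
```

The interior point is dropped from the outline, which is why a hull built from a handful of listings can understate or misshape a neighborhood's true footprint.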
Together, these maps reveal a rental landscape shaped by both geography and internal building logic. They support nuanced analysis of pricing strategy, affordability corridors, and market segmentation, offering developers, investors, and planners a spatial framework for decision-making.
Cluster Analysis¶
# Keep listings with coordinates; copy so cluster labels can be added safely
df = apt[apt["Coordinates"].notna()].copy()
# Feature matrix: latitude, longitude, PPSF
X = np.array([[coord[0], coord[1], ppsf] for coord, ppsf in zip(df["Coordinates"], df["price_per_sqft"])])
# Scale features so eps is measured in standard-deviation units
X_scaled = StandardScaler().fit_transform(X)
db = DBSCAN(eps=0.5, min_samples=10).fit(X_scaled)
df["Cluster"] = db.labels_
cluster_tags = {}
for label in df["Cluster"].unique():
    if label == -1:
        cluster_tags[label] = "Noise"
        continue
    cluster_df = df[df["Cluster"] == label]
    avg_ppsf = cluster_df["price_per_sqft"].mean()
    if avg_ppsf > 2.5:
        cluster_tags[label] = "Luxury Core"
    elif avg_ppsf < 1.5:
        cluster_tags[label] = "Budget Fringe"
    else:
        cluster_tags[label] = "Amenity-Rich Midrange"
df["Cluster_Label"] = df["Cluster"].map(cluster_tags)
import folium
from folium.plugins import MarkerCluster
cluster_map = folium.Map(location=[35.2271, -80.8431], zoom_start=12)
marker_cluster = MarkerCluster().add_to(cluster_map)
for _, row in df.iterrows():
    popup = f"{row['Complex']}<br>{row['Address']}<br>{row['price_per_sqft']:.2f} PPSF<br>Cluster: {row['Cluster_Label']}"
    folium.Marker(location=row["Coordinates"], popup=popup).add_to(marker_cluster)
cluster_map
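Charlotte’s rental market shows clear spatial segmentation when units are grouped by location and price per square foot (PPSF). DBSCAN clusters each listing on its standardized latitude, longitude, and PPSF, and each resulting cluster is then tagged by its mean PPSF: above $\$2.50$ as Luxury Core, below $\$1.50$ as Budget Fringe, and anything in between as Amenity-Rich Midrange. Three distinct submarkets emerge, each reflecting a different pricing logic and locational identity.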
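The clustering code standardizes latitude, longitude, and PPSF before running DBSCAN, which means `eps=0.5` is a distance in standard-deviation units across all three features, not in kilometers or dollars. The sketch below reproduces that transformation by hand on made-up rows to show what the distance is measured in. The `standardize` helper and the sample values are assumptions for illustration.

```python
import math
from statistics import mean, pstdev

# Made-up rows of [latitude, longitude, PPSF]
rows = [
    [35.22, -80.84, 2.6],
    [35.23, -80.85, 2.4],
    [35.15, -80.83, 1.4],
    [35.31, -80.75, 1.8],
]

def standardize(rows):
    """Subtract each column's mean and divide by its population standard deviation."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]

scaled = standardize(rows)

# Euclidean distance between the first two listings in standardized units
dist = math.dist(scaled[0], scaled[1])
```

Two listings count as neighbors only when this combined distance is at most `eps`, so a large price gap can separate listings that sit on the same block, and vice versa.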
DBSCAN overview: DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a clustering algorithm that groups points lying close together in feature space. Unlike KMeans, DBSCAN does not require a preset number of clusters. Instead, it grows clusters outward from dense regions, defined here as at least 10 listings (min_samples) within a distance of 0.5 (eps) in standardized units, and labels points that fall outside these regions as noise. This makes it especially useful for spatial data, where clusters may vary in shape, size, and density.
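To make the mechanics concrete, here is a minimal from-scratch sketch of the DBSCAN procedure. It is a teaching aid with quadratic running time, not the scikit-learn implementation the notebook uses, and the toy points and parameter values are made up.

```python
import math

NOISE = -1

def dbscan(points, eps, min_samples):
    """Label each point with a cluster id, or -1 for noise."""
    n = len(points)
    labels = [None] * n  # None means "not yet visited"

    def neighbors(i):
        # Indices within eps of point i, including i itself
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not dense
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # core point: keep expanding
        cluster += 1
    return labels

# Two tight groups and one isolated point (made-up coordinates)
points = [(0, 0), (0, 0.1), (0.1, 0), (0.1, 0.1),
          (5, 5), (5, 5.1), (5.1, 5), (5.1, 5.1),
          (10, 0)]
labels = dbscan(points, eps=0.3, min_samples=3)
```

The isolated point ends up with label -1, which is the same convention the notebook's tagging loop relies on when it maps -1 to "Noise".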
Luxury Core: These properties exhibit high PPSF and are typically compact, amenity-rich units in premium locations. Examples include The Leo Loso, Ello House, Hawkins Press, Solis Midtown, and Bond on Mint. Their elevated pricing likely reflects walkability, branding, and access to lifestyle infrastructure.
Amenity-Rich Midrange: Units in this cluster offer solid features at moderate PPSF, often located in transitional or peripheral zones. Properties such as Moderna Liberty Row, Broadstone Craft, The Henry, and Novel Mallard Creek fall into this category, suggesting newer builds or strategic pricing in growth corridors.
Budget Fringe: This segment includes larger, lower-cost units in less central areas. The Landon is a representative example, offering affordability with fewer premium features or locational advantages.
Noise: A small number of listings do not fit into any cluster. The Perch and Tyvola Tapestry are either spatially isolated or priced unlike their neighbors, potentially reflecting niche offerings or data irregularities.
Strategic segmentation: Clustering transforms raw pricing into contextual identity. Each unit now carries a market role, whether flagship, transitional, or fringe, enabling targeted strategies for investment, development, and policy. This segmentation also supports narrative clarity, allowing stakeholders to interpret pricing behavior through a spatial lens.
Spatial Outlier Detection¶
df = apt[apt["Coordinates"].notna()].copy()
df = df.reset_index(drop=True)  # Align index labels with row positions for BallTree lookups
# Prepare coordinates in radians (the haversine metric expects [lat, lon] in radians)
df["lat_rad"] = np.radians(df["Coordinates"].apply(lambda x: x[0]))
df["lon_rad"] = np.radians(df["Coordinates"].apply(lambda x: x[1]))
coords_rad = df[["lat_rad", "lon_rad"]].to_numpy()
# Build BallTree
tree = BallTree(coords_rad, metric="haversine")
radius_km = 0.5
radius_rad = radius_km / 6371.0  # Convert km to radians using Earth's mean radius
z_scores = []
for i, row in df.iterrows():
    # Neighbors within the radius (the listing itself is included)
    idx = tree.query_radius([coords_rad[i]], r=radius_rad)[0]
    local_ppsf = df.iloc[idx]["price_per_sqft"]
    local_std = local_ppsf.std()
    if len(local_ppsf) > 1 and local_std > 0:
        z = (row["price_per_sqft"] - local_ppsf.mean()) / local_std
    else:
        z = 0  # No neighbors or no local variation: treat as typical
    z_scores.append(z)
df["Local_Z"] = z_scores
df["Outlier_Type"] = "Typical"
df.loc[df["Local_Z"] > 1.5, "Outlier_Type"] = "High Outlier"
df.loc[df["Local_Z"] < -1.5, "Outlier_Type"] = "Low Outlier"
outlier_map = folium.Map(location=[35.2271, -80.8431], zoom_start=12)
color_map = {
    "High Outlier": "red",
    "Low Outlier": "blue",
    "Typical": "gray"
}
for _, row in df.iterrows():
    folium.CircleMarker(
        location=row["Coordinates"],
        radius=5,
        color=color_map[row["Outlier_Type"]],
        fill=True,
        fill_opacity=0.7,
        popup=f"{row['Complex']}<br>{row['price_per_sqft']:.2f} PPSF<br>{row['Outlier_Type']}"
    ).add_to(outlier_map)
outlier_map
Most units in Charlotte’s rental market are priced consistently with their immediate surroundings. However, one listing stands out. Using a ±1.5 z-score threshold within a 0.5 km radius, Moderna Liberty Row was flagged as a high PPSF outlier.
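The 0.5 km radius is applied through the haversine metric, which measures great-circle distance as an angle in radians. That is why the code divides the radius by Earth's mean radius of 6371 km. A small standalone sketch of the same conversion and distance formula follows; the helper name and the two sample points are illustrative and not taken from the dataset.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, the same constant used in the notebook

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Search radius expressed as an angle, as a haversine-based radius query expects
radius_rad = 0.5 / EARTH_RADIUS_KM

# Two made-up points 0.0036 degrees of latitude apart, roughly 0.4 km
d = haversine_km(35.2271, -80.8431, 35.2307, -80.8431)
within = d <= 0.5
```

At city scale 0.5 km is only a few blocks, so each listing's comparison group is mostly made up of units in its own complex and the immediately adjacent ones.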
Moderna Liberty Row: This property rents for approximately $\$2.49$/sqft, which is well above the local average. Its elevated pricing suggests a premium positioning that may stem from unique amenities, branded finishes, or micro-location advantages such as walkability, transit access, or views.
Emerging dynamics: The presence of a spatial outlier in an otherwise stable pricing zone may indicate early signs of submarket transition. It invites closer inspection into whether Moderna Liberty Row is setting a new pricing benchmark or simply reflecting a niche offering within its block.
While most units fall within expected pricing bands, outlier detection helps surface anomalies that might otherwise be overlooked. These listings can signal strategic pricing behavior, misalignment with local norms, or the emergence of micro-markets within broader neighborhoods.
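A worked example helps show how the local z-score behaves. The sketch below scores one listing against a hypothetical set of nearby PPSF values. The neighbor prices are invented for illustration and are not the actual listings around any property. Only the ±1.5 cutoff and the inclusion of the listing in its own neighborhood follow the notebook.

```python
from statistics import mean, stdev

def local_z(value, neighborhood):
    """Z-score of value against its neighborhood, which includes value itself."""
    if len(neighborhood) < 2:
        return 0.0
    sd = stdev(neighborhood)  # sample standard deviation, matching pandas' default
    if sd == 0:
        return 0.0  # no local variation, so nothing can stand out
    return (value - mean(neighborhood)) / sd

# Hypothetical PPSF values within 0.5 km; the last one is the listing being scored
nearby = [1.95, 2.00, 2.05, 2.00, 2.49]
z = local_z(2.49, nearby)

if z > 1.5:
    flag = "High Outlier"
elif z < -1.5:
    flag = "Low Outlier"
else:
    flag = "Typical"
```

Because the listing is part of its own comparison group, the largest z-score attainable with n listings and a sample standard deviation is (n - 1) / sqrt(n). With only three listings in range that ceiling is about 1.15, so a ±1.5 cutoff can only ever flag listings that have at least a handful of neighbors.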
How Spatial Patterns Shape Charlotte's Rental Landscape¶
Geospatial analysis of the data confirms a market shaped by location-driven segmentation, pricing logic, and spatial clustering. By layering price per square foot (PPSF) onto geographic coordinates, we uncovered how geography, density, and unit-level variation interact to define strategic submarkets and investment patterns.
Listing density maps reveal concentrated rental activity in Uptown, South End, and along major transit corridors. These high-demand zones reflect walkable, amenity-rich environments and compact unit layouts, factors that drive elevated PPSF and reinforce their identity as Charlotte’s luxury core. SouthPark, while more affordable, maintains strategic relevance through broader unit sizes and suburban appeal.
PPSF gradients expose affordability corridors and pricing anomalies. Cooler tones in Mallard Creek and Tyvola Tapestry highlight value-oriented inventory and legacy stock, while isolated high-PPSF units in mid-tier zones suggest branded offerings or transitional submarkets. These spatial contrasts help stakeholders interpret pricing behavior through a geographic lens, distinguishing premium saturation from affordability pockets and identifying areas of pricing misalignment.
Neighborhood boundary overlays add precision to submarket interpretation. In some cases, PPSF values align tightly with neighborhood borders, validating branding and market expectations. In others, pricing gradients spill across boundaries, revealing segmentation opportunities and evolving micro-neighborhoods. This misalignment signals areas where pricing may outpace perception, offering strategic entry points for developers and investors.
Geospatial clustering using DBSCAN identifies three distinct submarkets:
Luxury Core: High-PPSF units in central, walkable districts like Uptown and South End, anchored by properties such as The Leo Loso and Solis Midtown
Amenity-Rich Midrange: Moderately priced units in transitional zones, including Moderna Liberty Row and The Henry, reflecting newer builds and strategic positioning
Budget Fringe: Larger, lower-cost units in less central areas, exemplified by The Landon, offering affordability with fewer premium features
Outlier detection adds analytical precision. Moderna Liberty Row emerged as a high-PPSF anomaly within a midrange zone, suggesting branded finishes, unique amenities, or micro-location advantages. Its pricing behavior may reflect early signs of submarket transition or niche market emergence, offering a signal of shifting dynamics within its immediate surroundings.
Overall, the geospatial analysis reveals a rental landscape that is both spatially diverse and strategically segmented. The market accommodates a range of renter preferences, from luxury urban living to affordable suburban options. For analysts, developers, and planners, this spatial clarity supports targeted investment, context-aware pricing, and location-sensitive design strategies.
Regression analysis will build on these spatial insights by quantifying how factors like location, unit size, and amenities influence PPSF. This next phase will help validate observed patterns, test pricing hypotheses, and uncover the structural relationships driving the rental market. By linking spatial segmentation to statistical modeling, we move from mapping patterns to explaining them.